Transfer Learning Overview -- Episode 1 – Above Intelligent (AI)

The great strength of CNN architectures is their ability to automatically learn a hierarchy of feature detectors in order to solve a task. It has been observed that, regardless of the architecture, the dataset and the target semantic space (and of course the initialization, assuming it is random), the first layers seem to always converge to a specific kind of feature detector: Gabor filters. This is a very interesting and important phenomenon, as it suggests that Gabor filters are the most efficient way to start extracting semantics from an image. It would mean that a block of Gabor filters is a sort of "generic building block" that could be used to design NNs aimed at solving computer vision problems.

This is one of the main goals of Transfer Learning: finding "building blocks" that can be composed to build a NN and then fine-tuned on the dataset, instead of trained from scratch. Properly understanding how a CNN specializes while training is important to get to Transfer Learning, which means "transferring" the network's "capability of solving a problem" (basically, adapting its weights properly) to another, similar problem in a data-efficient way.

Data efficiency is in fact one of the most important aspects of Transfer Learning. It is well known that Supervised Learning is an effective way to make a (typically big) NN able to solve a problem, but it scales badly in terms of data: it typically requires a lot of supervision signal which, in the case of manual annotation, is expensive to collect because it relies on humans to provide it. Furthermore, the more difficult the task, the more the annotations need to be provided by human experts rather than laypeople, and experts' time is more expensive.
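To make the Gabor filters mentioned above more concrete, here is a minimal sketch of how such a filter can be written down by hand, using the standard real-valued form (a Gaussian envelope multiplied by a cosine carrier). The function names `gabor_kernel` and `filter_bank`, as well as all default parameter values, are assumptions chosen for illustration and do not come from the article. The code only shows what such a filter looks like, not how a CNN learns it.

```python
import math


def gabor_kernel(size=7, sigma=2.0, theta=0.0, wavelength=4.0, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows.

    All defaults are illustrative assumptions, not values from the article.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope; gamma controls its elongation.
            envelope = math.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2 * sigma ** 2))
            # Sinusoidal carrier along the rotated x axis.
            carrier = math.cos(2 * math.pi * x_rot / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_bank(orientations=4, **kwargs):
    """Build kernels at evenly spaced orientations in [0, pi)."""
    return [
        gabor_kernel(theta=i * math.pi / orientations, **kwargs)
        for i in range(orientations)
    ]


bank = filter_bank(orientations=4, size=7)
```

A bank like this, with a handful of orientations and wavelengths, resembles in spirit the kind of "generic building block" the text describes: a fixed first stage that could be reused across vision tasks while the later layers are fine-tuned.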